6 GeoServer in Production Environment

This page last changed on Dec 13, 2007 by aaime.

GeoServer in Production

Most of the GeoServer downloads are geared towards quickly showing off the capabilities, with an array of demos, sample layers, and an embedded servlet container. If you are using GeoServer in a production environment, there are a few things we'd like to recommend.

Choose your JDK carefully

GeoServer speed depends a lot on the chosen JDK, especially in the WMS subsystem.
We don't have good benchmarks at the moment, but informal tests show a marked speed increase going from JDK 1.4 to JDK 1.5, and another sizeable increase going to JDK 1.6, which is the best overall performer.

So, if your production environment allows it, use at least JDK 1.5, and if you can, go for JDK 1.6.

Install native JAI extensions

GeoServer 1.5.0 and later require JAI to work with coverages, and use it for WMS output generation too. By default we ship the pure Java version of JAI in the classpath, but for best performance you should install the native version in your JDK. See Dealing with native JAI for more detailed information, and for troubleshooting issues with some container classloaders and JDK 1.4.

On GeoServer 1.5.0/1.5.1 there is an issue (see GEOS-1042 for details) with the native JAI extensions and JPEG output that sometimes results in bizarre colours when native acceleration is used. You will likely want to disable "JAI JPEG native acceleration" under Config | Server in the web console, or upgrade to GeoServer 1.5.2, where the issue has been fixed.

Configure your container for production

Most open source Java web containers, such as Tomcat or Jetty, ship with development mode configurations that allow for quick startup but don't deliver the best performance.

Find a way to set up the Java virtual machine options in your container and set the following options:

-server: this enables the server JVM, whose JIT compiler applies stronger optimizations. Startup and first calls will be slower because JIT compilation takes more time, but subsequent ones will be faster (to give you some numbers, on the same machine a vanilla VM returns GML at 7MB/s, a -server one at 10MB/s);
-Xms48m -Xmx256M: give your server memory. By default the JVM will use only 64MB of heap. If you're serving just vector data, responses are fully streamed, so extra memory won't help much, but if you're serving coverages JAI will use a cache to avoid hitting the disk often. In this case, give GeoServer at least 256MB of memory, or more if you have plenty of RAM, and configure the JAI tile cache size in the GeoServer configuration panel so that it uses 75% of the heap (0.75). -Xms48m tells the virtual machine to grab a 48MB heap on startup, which makes heap management more stable under heavy load;
-XX:SoftRefLRUPolicyMSPerMB=36000: this makes soft references live longer. GeoServer uses soft references to cache data store references and the like, so making them live longer increases the effectiveness of the cache (the cache is there to allow GeoServer to scale to thousands of data stores, something not very common, but we have a few users pushing GeoServer there).
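
One way to apply the options above in Tomcat is a bin/setenv.sh script, which Tomcat's catalina.sh reads at startup if the file exists. The following is only a minimal sketch, assuming a Unix install started through catalina.sh; the values are the ones discussed above and should be adjusted to your own memory budget. Other containers have their own mechanism, so check their documentation.

```shell
#!/bin/sh
# Sketch of $CATALINA_HOME/bin/setenv.sh (assumed location; the file is
# optional and usually has to be created by hand).

# -server                            use the server JVM
# -Xms48m / -Xmx256m                 initial and maximum heap size
# -XX:SoftRefLRUPolicyMSPerMB=36000  keep soft references alive longer
CATALINA_OPTS="-server -Xms48m -Xmx256m -XX:SoftRefLRUPolicyMSPerMB=36000"
export CATALINA_OPTS
```

CATALINA_OPTS is used here rather than JAVA_OPTS so that the settings apply to the server process itself and not to the short-lived JVM that runs the stop command.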

Also, check your server documentation to see if there are other tweaks you can use to speed up the server in a production environment (for example, we found that Tomcat 6 serves maps faster when the APR runtime is installed).

Finally, if you're using an open source container, you may want to try out a few and assess their performance under stress; you may find that some are faster than others (we don't have benchmarks on this at the moment).

Set up logging for production

Logging may visibly affect the performance of your server. High logging levels are often necessary to track down issues, but by default you should run with low ones (you can switch the logging level at runtime, so don't worry about having to stop the server to gather more information).
You can change the logging level by going to the GeoServer configuration panel, Server section.
By default GeoServer 1.5.x and earlier versions ship with the CONFIGURE logging level, but for production you may want to set it to WARNING or SEVERE (and enable file logging, since in production you will most probably run GeoServer as a service and thus won't be able to see the console output).
GeoServer 1.6.0 onwards has configuration sets instead, and you'll probably want to choose the PRODUCTION configuration (file logging should be enabled by default).

Choose a service strategy

A service strategy is the way we serve the output to the client. Basically, you have to choose between being absolutely sure of reporting errors with the proper OGC codes and serving output quickly. You can configure the service strategy by modifying the web.xml file of your GeoServer install. The possible strategies are:

SPEED: serves output right away. This is the fastest strategy, but it makes it almost impossible to report proper OGC errors in WFS.
BUFFER: stores the whole result in memory, and serves it only once the output is complete. This ensures proper OGC error reporting, but delays the response quite a bit and will exhaust memory if the response is big.
FILE: same as BUFFER, but uses file storage for buffering. Slower than BUFFER, but ensures there won't be memory issues.
PARTIAL-BUFFER: a balance between the two; it buffers a few kilobytes of the response in memory, then behaves like SPEED. Unfortunately, up to GeoServer 1.5.0 it had speed issues. If you're using GeoServer 1.5.0 you can configure PARTIAL-BUFFER2, a replacement that has no speed issues; if you're on 1.4.0 and need maximum performance, you may want to choose SPEED.
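
As an illustration, the strategy is assumed here to be stored in web.xml as a context parameter named serviceStrategy. Check the web.xml shipped with your version for the exact element, since the fragment below is a reconstruction and not a copy of the real file. The sketch writes such a fragment to a temporary file and reads the configured value back with sed, which is a quick way to confirm what a deployed instance is using.

```shell
#!/bin/sh
# Sample fragment mimicking the assumed layout of WEB-INF/web.xml.
webxml=$(mktemp)
cat > "$webxml" <<'EOF'
<web-app>
  <context-param>
    <param-name>serviceStrategy</param-name>
    <param-value>PARTIAL-BUFFER2</param-value>
  </context-param>
</web-app>
EOF

# Print the param-value that belongs to the serviceStrategy context-param.
current_strategy() {
  sed -n '/<param-name>serviceStrategy<\/param-name>/,/<\/context-param>/s|.*<param-value>\(.*\)</param-value>.*|\1|p' "$1"
}

strategy=$(current_strategy "$webxml")
echo "service strategy: $strategy"
rm -f "$webxml"
```

Against a real install, current_strategy would be pointed at the web.xml inside the deployed GeoServer web application instead of the temporary file.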

Use a Spatial Database

We make shapefiles available as a datastore, as they are such a common format. But if you are running GeoServer in a production environment, setting up a spatial database and converting your shapefiles is highly recommended. If you're doing transactions against GeoServer, this is essential: even though we have a very nice transaction framework, doubling up with the native transaction support of relational databases ensures your data integrity. Almost all the major spatial databases provide tools to easily convert shapefiles into their native format. We recommend PostGIS, the open source spatial extension to the PostgreSQL database, as most of our testing has been performed against it. Oracle, DB2, and ArcSDE are also well supported. At the moment we don't recommend MySQL, as it has trouble with rollbacks on geometry tables and lacks advanced spatial functionality, but it is an option.
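
For PostGIS, the conversion is typically done with the shp2pgsql loader that ships with it, piped into psql. The sketch below wraps that pipeline in a small function. The shapefile, table, SRID and database names in the usage line are placeholders, and it assumes PostGIS is already installed in the target database.

```shell
#!/bin/sh
# load_shapefile SHAPEFILE TABLE SRID DATABASE
# Converts a shapefile to SQL with shp2pgsql and loads it with psql.
#   -s SRID  declares the spatial reference system of the data
#   -I       creates a spatial (GiST) index on the geometry column
load_shapefile() {
  shp2pgsql -s "$3" -I "$1" "$2" | psql -q -d "$4"
}

# Example call (placeholder names):
#   load_shapefile states.shp public.states 4326 gisdb
```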

Use the best performing coverage formats

There are very significant differences in performance between the various coverage formats.
Read the documentation on high performance coverage serving and make sure you've prepared your data for optimal web serving performance.

Turn on GZIP compression for selected mime types

Various GeoServer outputs are highly compressible, and the HTTP standard provides a transparent way to have content compressed on the fly, known as GZIP compression. This may reduce the amount of data travelling over the net by a factor of 10 or more, making responses faster and allowing more clients to hit the same server. The same goes for the OpenLayers previews, since the OL source is relatively big.
uDig has noticed significant performance improvements when this is turned on. It does no harm to turn it on, as the negotiation is part of the HTTP protocol. Most servlet containers should support the option.

In Tomcat 5.5 the standard connector configuration is found in %TOMCAT_HOME%/conf/server.xml.

Look for the service connector you wish to configure for compression and add two attributes, 'compression' and 'compressableMimeType'.

For example:
<Connector
    port="80" maxHttpHeaderSize="8192"
    URIEncoding="UTF-8"
    maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
    enableLookups="false" redirectPort="8443" acceptCount="100"
    connectionTimeout="20000" disableUploadTimeout="true"
    compression="on"
    compressableMimeType="text/html,text/xml,text/plain,application/xml,application/vnd.google-earth.kml+xml,application/json,application/x-javascript"
/>
For more information refer to: http://tomcat.apache.org/tomcat-5.0-doc/config/http.html
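
To confirm that compression is actually active after restarting the container, request a resource whose MIME type is in the configured list with an Accept-Encoding: gzip header, and look for Content-Encoding: gzip in the response headers. A minimal sketch using curl follows. The host and path in the usage line are assumptions to be replaced with your own.

```shell
#!/bin/sh
# Reads HTTP response headers on stdin and succeeds (exit status 0)
# if the response was gzip-compressed.
has_gzip_encoding() {
  grep -qi '^content-encoding:.*gzip'
}

# Example call (hypothetical URL):
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' \
#     'http://localhost/geoserver/wfs?service=WFS&request=GetCapabilities' \
#     | has_gzip_encoding && echo "gzip is on"
```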

Make use of a Data Directory to store your configurations

In older versions of GeoServer the configuration was stored directly in the servlet container. We're in the process of moving it out into its own directory. As of 1.3.0 RC7, the binary releases come configured for this by default, in the conf/ directory. The war is still in transition, since it's harder for us to make war users adopt the data directory when more control is in the hands of the administrator. But using a data directory allows for much easier upgrades, since there is no risk of configuration being overwritten: you just plug in the new war and point it at the same data directory. You can even configure using a binary release, and then point the war in your servlet container at the same data directory. The data directory also makes it easy to pass your configuration to someone else, as you can just zip it up and email it along; it's all in one place, whereas the embedded configuration was spread out.
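
A minimal sketch of pointing a war deployment at an external data directory follows. It assumes your GeoServer version honours the GEOSERVER_DATA_DIR setting as an environment variable (check the data directory documentation for your release) and that the container is started from a shell that inherits this environment. The path is a placeholder.

```shell
#!/bin/sh
# Placeholder path: a directory outside the servlet container, so that
# replacing the war does not touch the configuration.
GEOSERVER_DATA_DIR=/var/lib/geoserver_data
export GEOSERVER_DATA_DIR

# Pack the whole configuration into one archive, for backup or to pass along:
#   tar czf geoserver_data.tar.gz -C "$(dirname "$GEOSERVER_DATA_DIR")" \
#     "$(basename "$GEOSERVER_DATA_DIR")"
```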

Configure all data and metadata for your instance

It may be tempting to skip some of the configuration steps and leave in the same keywords and abstract as the sample. Please do not, as this will only confuse potential users, who will end up with a list of GeoServers all called 'My GeoServer'. Completely fill out the WFS and WMS Contents sections. Put in your own URI (such as the name of your website) for the Namespace (Config -> Data -> Namespace), and remove the defaults. Make sure your datastores all use your URI. Remove the sample featureTypes, so there is not always a 'states' layer. You should also delete all the 'demo' functionality that comes with GeoServer, since it will no longer work once the sample data is gone. The two main ones are in data/mbdemos and data/demo/popup_map. In the future we will make available an 'Empty' configuration, where you won't have to delete things. If anyone is interested now, just email the list.

Move client functionality to its own war

In the default installs we include a couple of sample clients, embedded right in GeoServer. We recommend against doing this for production instances, since an upgrade could overwrite the old configs. The best way around this is to create your own war and deploy it next to GeoServer. From there you can still use relative links and the like; it's just one level up. MapBuilder has a distribution that comes as a war, and from there you could easily customize a client geared towards your layer. If you are just trying out GeoServer, playing with MapBuilder within GeoServer is fine, but we recommend against going live with it. Keeping the client separate also makes it a lot easier to distribute with the configuration: you send someone the zipped data directory and the war, and they drop the war in beside GeoServer and point GeoServer at the data directory.

Set security

GeoServer by default includes WFS-T, which lets users modify your backend database. If you don't want that to happen, you can turn off transactions in the web admin tool: go to Config -> WFS -> Content and set Service Level to Basic. If you'd like some users to be able to modify data, but not all, you'll have to set up an external security service. One easy way to do this is to run two GeoServers at different URLs or ports, configure them differently, and use HTTP or some other simple authentication so that only the right users have access to the transactional one. In the future we should have a better security framework at the GeoServer level to do this, but for now users have found success just putting security on the outside.

For extra security, make sure that the connection to the datastore that is open to all uses a database user with read-only permissions. That way, even a successful SQL injection could not modify your data (though GeoServer is generally designed well enough that it's not vulnerable).

Dealing with a locked down environment

GeoServer code and the libraries it uses (GeoTools, JAI) are not designed to run in a locked-down security environment. They need free access to environment variables, the temp directory, user preferences and the like.

On operating systems like Ubuntu, the default Tomcat is locked down so that most of the above is not authorized. So far, the only way to run GeoServer in that environment is to grant it all permissions. If you are running Ubuntu, this blog entry contains a quite detailed set of instructions on how to set up GeoServer with the distribution-provided Tomcat package: http://grimmeister.wordpress.com/2007/08/08/setting-up-an-open-geospatial-consortium-service-server.

Create a link to your capabilities document

If there were a useful global catalog, this step would be to register there. As there is not, the next best thing is to put a link to your capabilities document somewhere on the web, so that search engines will crawl and index it. Be sure that the link includes 'request=getcapabilities' somewhere in it, as that's how Google scrapers find capabilities documents. Your server should then show up in indexes like mapdex and Refraction's Google scraper.
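
For example, the links could look like the ones printed by the sketch below. The base URL is a placeholder, and the /wms and /wfs paths assume GeoServer's usual service endpoints under the web application root.

```shell
#!/bin/sh
# capabilities_url BASE SERVICE
# Prints a GetCapabilities link for the given service (wms or wfs).
capabilities_url() {
  service_upper=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
  printf '%s/%s?service=%s&request=GetCapabilities\n' "$1" "$2" "$service_upper"
}

capabilities_url "http://www.example.com/geoserver" wms
capabilities_url "http://www.example.com/geoserver" wfs
```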

Caching

Server-side caching of WMS tiles is the best way to get performance. Essentially, the server recognizes a request and quickly returns a pre-rendered result. This works best for tile-based WMS clients, which request the same tiles over and over. It is hard to cache regular WMS GetMap requests, because the bounding box, and therefore the URL, may be slightly different each time. There are several ways to set up caching for GeoServer. You can use an embedded cache service called OSCache, or use a tool called Squid to perform the caching. MetaCarta has also come up with a server-side caching solution that you can get here.

For a bit of information on what one user did to get a very scalable GeoServer running, see Clustering and Caching GeoServer. And if you have more experience running GeoServer in production, feel free to add it here or create a Use Narrative, just an informal sort of blog-ish post on what you did. Also, if you need to run GeoServer behind a proxy, see Configure GeoServer to run with a proxy.

Comments:

For linux systems, I have noticed very positive performance enhancements by compiling and installing the Tomcat native libraries. Explicit details for this can be found at http://tomcat.apache.org/tomcat-5.5-doc/apr.html.

Posted by heimdallr at Dec 21, 2007 01:25

Document generated by Confluence on Jan 16, 2008 23:27